Subspace Embeddings and \(\ell_p\)-Regression Using Exponential Random Variables
Authors
Abstract
Oblivious low-distortion subspace embeddings are a crucial building block for numerical linear algebra problems. We show for any real p, 1 ≤ p < ∞, given a matrix M ∈ R^{n×d} with n ≫ d, with constant probability we can choose a matrix Π with max(1, n^{1−2/p})·poly(d) rows and n columns so that simultaneously for all x ∈ R^d, ‖Mx‖_p ≤ ‖ΠMx‖_∞ ≤ poly(d)·‖Mx‖_p. Importantly, ΠM can be computed in the optimal O(nnz(M)) time, where nnz(M) is the number of non-zero entries of M. This generalizes all previous oblivious subspace embeddings, which required p ∈ [1, 2] due to their use of p-stable random variables. Using our matrices Π, we also improve the best known distortion of oblivious subspace embeddings of ℓ_1 into ℓ_1 with Õ(d) target dimension in O(nnz(M)) time from Õ(d^3) to Õ(d^2), which can further be improved to Õ(d^{3/2}) log^{1/2} n if d = Ω(log n), answering a question of Meng and Mahoney (STOC, 2013). We apply our results to ℓ_p-regression, obtaining a (1 + ε)-approximation in O(nnz(M) log n) + poly(d/ε) time, improving the best known poly(d/ε) factors for every p ∈ [1, ∞) \ {2}. If one is interested in only a poly(d)- rather than a (1 + ε)-approximation to ℓ_p-regression, a corollary of our results is that for all p ∈ [1, ∞) we can solve the ℓ_p-regression problem without using general convex programming; that is, since our subspace embeds into ℓ_∞, it suffices to solve a linear programming problem. Finally, we give the first protocols for the distributed ℓ_p-regression problem for every p ≥ 1 which are nearly optimal in communication and computation.
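The embedding into ℓ_∞ rests on the min-stability of exponential random variables: if each coordinate of y = Mx is rescaled by an independent standard exponential, the largest rescaled entry has the same distribution as ‖y‖_p / E^{1/p} for a single standard exponential E, and is therefore a constant-factor estimator of ‖y‖_p. A minimal numerical sketch of this fact (the function name and all parameters below are our own illustration, not the paper's construction):

```python
import numpy as np

# Min-stability: for i.i.d. standard exponentials E_1..E_n,
# min_i E_i / |y_i|^p is exponential with rate ||y||_p^p, hence
#   max_i |y_i| / E_i^{1/p}  =  ||y||_p / E^{1/p}   in distribution,
# for a single standard exponential E. The l_infinity norm of the
# rescaled vector thus estimates ||y||_p up to a constant factor.

def linf_estimate(y, p, rng):
    e = rng.exponential(scale=1.0, size=y.shape)   # E_i ~ Exp(1)
    return np.max(np.abs(y) / e ** (1.0 / p))      # = ||y||_p / E^{1/p} in law

rng = np.random.default_rng(0)
y = rng.standard_normal(1000)
p = 1.5
true_norm = np.sum(np.abs(y) ** p) ** (1.0 / p)

# The median over independent trials concentrates near
# ||y||_p / (ln 2)^{1/p}, a fixed constant multiple of ||y||_p.
est = np.median([linf_estimate(y, p, rng) for _ in range(501)])
print(round(est / true_norm, 2))  # close to (ln 2)^{-1/p}
```

A single draw is heavy-tailed (E^{-1/p} has no moments of high order), which is why only a poly(d) distortion, rather than 1 + ε, is guaranteed for the upper bound over a whole subspace.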
Similar resources
Tight Bounds for $\ell_p$ Oblivious Subspace Embeddings
An ℓ_p oblivious subspace embedding is a distribution over r × n matrices Π such that for any fixed n × d matrix A, Pr_Π[for all x, ‖Ax‖_p ≤ ‖ΠAx‖_p ≤ κ‖Ax‖_p] ≥ 9/10, where r is the dimension of the embedding, κ is the distortion of the embedding, and for an n-dimensional vector y, ‖y‖_p = (∑_{i=1}^n |y_i|^p)^{1/p} is the ℓ_p-norm. Another important property is the sparsity of Π, that is, the maximum numbe...
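The sparsity property mentioned above is what makes these sketches fast: a CountSketch-style Π with exactly one random-sign nonzero per column can be applied to A in O(nnz(A)) time. A hedged sketch for the p = 2 case (the helper name and the dimensions n, d, r below are illustrative assumptions, not values from the paper):

```python
import numpy as np

# CountSketch-style sparse embedding: each of the n columns of Pi has a
# single nonzero entry, a random sign placed in a random one of r rows.
# Applying Pi to A then costs one signed addition per nonzero of A.

def countsketch(A, r, rng):
    n = A.shape[0]
    rows = rng.integers(0, r, size=n)        # hash each coordinate to a row
    signs = rng.choice([-1.0, 1.0], size=n)  # independent random signs
    SA = np.zeros((r, A.shape[1]))
    np.add.at(SA, rows, signs[:, None] * A)  # accumulate signed rows of A
    return SA

rng = np.random.default_rng(1)
n, d, r = 10000, 5, 2000
A = rng.standard_normal((n, d))
SA = countsketch(A, r, rng)

# For p = 2 the sketch approximately preserves Euclidean norms:
x = rng.standard_normal(d)
ratio = np.linalg.norm(SA @ x) / np.linalg.norm(A @ x)
print(round(ratio, 2))  # close to 1
```

Here r is taken comfortably large; how small r can be for a given distortion κ and norm p is exactly the trade-off the abstract above studies.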
Subspace Embeddings for the Polynomial Kernel
Sketching is a powerful dimensionality reduction tool for accelerating statistical learning algorithms. However, its applicability has been limited to a certain extent since the crucial ingredient, the so-called oblivious subspace embedding, can only be applied to data spaces with an explicit representation as the column span or row span of a matrix, while in many settings learning is done in a...
Robust blind methods using $\ell_p$ quasi norms
It was shown in a previous work that some blind methods can be made robust to channel-order overmodeling by using the ℓ_1 norm or ℓ_p quasi-norms. However, no theoretical argument has been provided to support this statement. In this work, we study the robustness of subspace-based blind methods using ℓ_1 or ℓ_p quasi-norms. For the ℓ_1 norm, we provide the necessary and sufficient condition that the chann...
Experimental study for the comparison of classifier combination methods
In this paper, we compare the performances of classifier combination methods (bagging, modified random subspace method, classifier selection, parametric fusion) to logistic regression in consideration of various characteristics of input data. Four factors used to simulate the logistic model are: (a) combination function among input variables, (b) correlation between input variables, (c) varianc...